BEDTools: a flexible suite of utilities for comparing genomic features

Authors

  • Aaron R. Quinlan
  • Ira M. Hall
Abstract

MOTIVATION: Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner.

RESULTS: This article introduces a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The individual BEDTools utilities can be combined with one another as well as with standard UNIX commands, which simplifies routine genomics tasks and supports pipelines that can quickly answer intricate questions of large genomic datasets.

AVAILABILITY AND IMPLEMENTATION: BEDTools was written in C++. Source code and a comprehensive user manual are freely available at http://code.google.com/p/bedtools

CONTACT: [email protected]; [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
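To make the central task in the abstract concrete, the following is a minimal Python sketch of what "searching for overlaps between features" means for BED records. It is an illustration only, not the BEDTools implementation (which is written in C++ and is far more efficient). The names `Feature`, `parse_bed`, `overlaps` and `intersect`, and the sample records, are hypothetical. The sketch relies on the BED convention that start coordinates are 0-based and end coordinates are exclusive, and it uses a simple all-against-all scan that would be too slow for real sequencing datasets.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Feature(NamedTuple):
    """One BED record (hypothetical helper type, first four columns only)."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)
    name: str


def parse_bed(lines: Iterable[str]) -> Iterator[Feature]:
    """Parse tab-separated BED lines, skipping blanks, comments and headers."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else "."
        yield Feature(fields[0], int(fields[1]), int(fields[2]), name)


def overlaps(a: Feature, b: Feature) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intersect(a_feats: Iterable[Feature],
              b_feats: Iterable[Feature]) -> Iterator[Tuple[Feature, Feature]]:
    """Yield every overlapping (a, b) pair using a naive quadratic scan."""
    b_list = list(b_feats)
    for a in a_feats:
        for b in b_list:
            if overlaps(a, b):
                yield a, b


# Made-up example data: three reads compared against two annotations.
reads = parse_bed([
    "chr1\t100\t200\tread1",
    "chr1\t300\t400\tread2",
    "chr2\t100\t200\tread3",
])
genes = parse_bed([
    "chr1\t150\t350\tgeneA",
    "chr2\t200\t300\tgeneB",
])
pairs = [(a.name, b.name) for a, b in intersect(reads, genes)]
print(pairs)
```

In this sketch `read3` and `geneB` are adjacent but do not overlap, because the end coordinate of a half-open interval is excluded.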

Related resources

BEDTools: The Swiss-Army Tool for Genome Feature Analysis.

Technological advances have enabled the use of DNA sequencing as a flexible tool to characterize genetic variation and to measure the activity of diverse cellular phenomena such as gene isoform expression and transcription factor binding. Extracting biological insight from the experiments enabled by these advances demands the analysis of large, multi-dimensional datasets. This unit describes th...

valr: Reproducible genome interval analysis in R

New tools for reproducible exploratory data analysis of large datasets are important to address the rising size and complexity of genomic data. We developed the valr R package to enable flexible and efficient genomic interval analysis. valr leverages new tools available in the "tidyverse", including dplyr. Benchmarks of valr show it performs similar to BEDtools and can be used for interactive a...

Pybedtools: a flexible Python library for manipulating genomic datasets and annotations

SUMMARY pybedtools is a flexible Python software library for manipulating and exploring genomic datasets in many common formats. It provides an intuitive Python interface that extends upon the popular BEDTools genome arithmetic tools. The library is well documented and efficient, and allows researchers to quickly develop simple, yet powerful scripts that enable complex genomic analyses. AVAIL...

Integrating information systems in electric utilities

This paper presents an integration system ISIS has developed for electric utilities, on top of which decision support tools can be cost-effectively developed and integrated with the other information system commonly present in electric utilities. The system, called the Integrated Distribution Management System (IDMS), provides an easily configurable integration framework, and currently includes...

The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences.

JavaScript is an object-based scripting language that can be interpreted by most commonly used Web browsers, including Netscape® Navigator® and Internet Explorer®. In conjunction with HTML form elements, JavaScript can be used to make flexible and easy-to-use applications that can be accessed by anyone connected to the Internet (3). The Sequence Manipulation Suite (http://www.ualberta.ca/~stoth...

Journal title:

Volume 26  Issue 

Pages  -

Publication date: 2010